To measure the effects of geographic location, income, and population on life expectancy
#Importing raw data
gapminder <- read.csv("~/Desktop/Harrisburg University/ANLY 506-90-O/Exploratory-Data-Analytics/Data/gapminder.csv")
#Summary of data
str(gapminder)
## 'data.frame': 41284 obs. of 6 variables:
## $ Country : Factor w/ 197 levels "Afghanistan",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Year : int 1800 1801 1802 1803 1804 1805 1806 1807 1808 1809 ...
## $ life : num 28.2 28.2 28.2 28.2 28.2 ...
## $ population: Factor w/ 15260 levels "","1,005,328,574",..: 7490 1 1 1 1 1 1 1 1 1 ...
## $ income : num 603 603 603 603 603 603 603 603 603 603 ...
## $ region : Factor w/ 6 levels "America","East Asia & Pacific",..: 5 5 5 5 5 5 5 5 5 5 ...
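The structure shows that Country, population, and region are stored as factors, life and income are numeric, and Year is an integer. Note that population is a factor rather than a number: its levels include an empty string and values with thousands separators such as "1,005,328,574".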
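Since population is read as a factor, it has to be converted before it can be used as a quantity. The following is a minimal sketch of one way to do that, run on a small made-up vector rather than the real column; the helper name `clean_population` is hypothetical and not part of the original analysis.

```r
# Hypothetical stand-in for gapminder$population: a factor containing
# thousands separators and an empty string, as seen in the str() output.
pop_raw <- factor(c("1,005,328,574", "", "121000", "14092"))

# Convert via character first. Calling as.numeric() directly on a factor
# returns the internal level codes, not the printed values.
clean_population <- function(x) {
  x <- gsub(",", "", as.character(x), fixed = TRUE)  # drop separators
  x[x == ""] <- NA_character_                        # blanks become NA
  as.numeric(x)
}

pop_num <- clean_population(pop_raw)
```

Applied to the real data, the same helper could be used as `gapminder$population <- clean_population(gapminder$population)`, assuming the column has the form shown by `str()` above.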
#Study the scope
summary(gapminder)
##                 Country          Year           life
##  Afghanistan        :  216   Min.   :1800   Min.   : 1.00
##  Albania            :  216   1st Qu.:1854   1st Qu.:31.00
##  Algeria            :  216   Median :1908   Median :35.12
##  Angola             :  216   Mean   :1907   Mean   :42.88
##  Antigua and Barbuda:  216   3rd Qu.:1962   3rd Qu.:55.60
##  Argentina          :  216   Max.   :2015   Max.   :84.10
##  (Other)            :39988
##    population        income                              region
##         :25817   Min.   :   142   America                   : 7961
##  121000 :    6   1st Qu.:   883   East Asia & Pacific       : 6256
##  14092  :    6   Median :  1450   Europe & Central Asia     :10468
##  1432000:    6   Mean   :  4571   Middle East & North Africa: 4309
##  229000 :    6   3rd Qu.:  3483   South Asia                : 1728
##  2574000:    6   Max.   :182668   Sub-Saharan Africa        :10562
##  (Other):15437   NA's   :  2341
levels(gapminder$Country)
## [1] "Afghanistan" "Åland"
## [3] "Albania" "Algeria"
## [5] "Andorra" "Angola"
## [7] "Antigua and Barbuda" "Argentina"
## [9] "Armenia" "Aruba"
## [11] "Australia" "Austria"
## [13] "Azerbaijan" "Bahamas"
## [15] "Bahrain" "Bangladesh"
## [17] "Barbados" "Belarus"
## [19] "Belgium" "Belize"
## [21] "Benin" "Bhutan"
## [23] "Bolivia" "Bosnia and Herzegovina"
## [25] "Botswana" "Brazil"
## [27] "Brunei" "Bulgaria"
## [29] "Burkina Faso" "Burundi"
## [31] "Cambodia" "Cameroon"
## [33] "Canada" "Cape Verde"
## [35] "Chad" "Channel Islands"
## [37] "Chile" "China"
## [39] "Colombia" "Comoros"
## [41] "Congo, Dem. Rep." "Congo, Rep."
## [43] "Costa Rica" "Cote d'Ivoire"
## [45] "Croatia" "Cuba"
## [47] "Cyprus" "Denmark"
## [49] "Djibouti" "Dominica"
## [51] "Ecuador" "Egypt"
## [53] "El Salvador" "Equatorial Guinea"
## [55] "Eritrea" "Estonia"
## [57] "Ethiopia" "Fiji"
## [59] "Finland" "France"
## [61] "French Guiana" "French Polynesia"
## [63] "Gabon" "Gambia"
## [65] "Georgia" "Germany"
## [67] "Ghana" "Greece"
## [69] "Greenland" "Grenada"
## [71] "Guadeloupe" "Guam"
## [73] "Guatemala" "Guinea"
## [75] "Guinea-Bissau" "Guyana"
## [77] "Haiti" "Honduras"
## [79] "Hong Kong, China" "Hungary"
## [81] "Iceland" "India"
## [83] "Indonesia" "Iran"
## [85] "Iraq" "Ireland"
## [87] "Israel" "Italy"
## [89] "Jamaica" "Japan"
## [91] "Jordan" "Kazakhstan"
## [93] "Kenya" "Kiribati"
## [95] "Kuwait" "Latvia"
## [97] "Lebanon" "Lesotho"
## [99] "Liberia" "Libya"
## [101] "Lithuania" "Luxembourg"
## [103] "Macao, China" "Macedonia, FYR"
## [105] "Madagascar" "Malawi"
## [107] "Malaysia" "Maldives"
## [109] "Mali" "Malta"
## [111] "Marshall Islands" "Martinique"
## [113] "Mauritania" "Mauritius"
## [115] "Mayotte" "Mexico"
## [117] "Micronesia, Fed. Sts." "Moldova"
## [119] "Mongolia" "Montenegro"
## [121] "Morocco" "Mozambique"
## [123] "Myanmar" "Namibia"
## [125] "Nepal" "Netherlands"
## [127] "Netherlands Antilles" "New Caledonia"
## [129] "New Zealand" "Nicaragua"
## [131] "Niger" "Nigeria"
## [133] "Norway" "Oman"
## [135] "Pakistan" "Panama"
## [137] "Papua New Guinea" "Paraguay"
## [139] "Peru" "Philippines"
## [141] "Poland" "Portugal"
## [143] "Puerto Rico" "Qatar"
## [145] "Reunion" "Romania"
## [147] "Russia" "Rwanda"
## [149] "Samoa" "Sao Tome and Principe"
## [151] "Saudi Arabia" "Senegal"
## [153] "Serbia" "Seychelles"
## [155] "Sierra Leone" "Singapore"
## [157] "Slovak Republic" "Slovenia"
## [159] "Solomon Islands" "Somalia"
## [161] "South Africa" "South Sudan"
## [163] "Spain" "Sri Lanka"
## [165] "Sudan" "Suriname"
## [167] "Swaziland" "Sweden"
## [169] "Switzerland" "Syria"
## [171] "Taiwan" "Tajikistan"
## [173] "Tanzania" "Thailand"
## [175] "Timor-Leste" "Togo"
## [177] "Tokelau" "Tonga"
## [179] "Trinidad and Tobago" "Tunisia"
## [181] "Turkey" "Turkmenistan"
## [183] "Uganda" "Ukraine"
## [185] "United Arab Emirates" "United Kingdom"
## [187] "United States" "Uruguay"
## [189] "Uzbekistan" "Vanuatu"
## [191] "Venezuela" "Vietnam"
## [193] "Virgin Islands (U.S.)" "West Bank and Gaza"
## [195] "Western Sahara" "Zambia"
## [197] "Zimbabwe"
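From the summary, the data cover 197 countries across 6 regions, with up to 216 yearly entries per country (1800 to 2015). Because 197 × 216 = 42,552 exceeds the 41,284 rows in the data frame, some countries must have fewer than 216 entries. Life expectancy ranges from 1 to 84.1 years. The population column is blank in 25,817 rows, and income is missing in 2,341 rows.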
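Whether every country really has a row for every year can be checked by tabulating rows per country. The sketch below uses a tiny made-up data frame (`toy`) in place of `gapminder`; the column names match the real data, but the values are illustrative only.

```r
# Hypothetical miniature of the data: country "B" is missing one year.
toy <- data.frame(
  Country = c("A", "A", "A", "B", "B"),
  Year    = c(1800L, 1801L, 1802L, 1800L, 1801L)
)

# Number of rows recorded for each country.
rows_per_country <- table(toy$Country)

# A complete series has one row per distinct year in the data.
expected_rows <- length(unique(toy$Year))

# Countries whose series is shorter than the full year range.
incomplete <- names(rows_per_country)[rows_per_country < expected_rows]
```

On the real data, replacing `toy` with `gapminder` would list the countries with fewer than 216 entries.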
#gapminder_2015 <- gapminder %>% filter(Year==2015)
#symbols(gapminder_2015$income, gapminder_2015$life, circles = gapminder_2015$population, inches=0.35, fg="white", bg="red", xlab="Income", ylab="Life Expectancy") + scale_x_log10()
#text(gapminder_2015$income, gapminder_2015$life, gapminder_2015$Country, cex=0.4)
library("plotly")
## Loading required package: ggplot2
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
p <- gapminder %>%
  filter(Year == 2015) %>%
  ggplot(aes(income, life, size = population, color = region)) +
  geom_point() +
  scale_x_log10() +
  theme_bw()
ggplotly(p)
## Warning: Using size for a discrete variable is not advised.
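The warning appears because population is a factor, so ggplot2 treats the size aesthetic as discrete. As a rough sketch, the same bubble chart drawn from a numeric population column avoids the warning and scales the point area with population. The data frame `toy_2015` and its values are invented for illustration, and population is assumed to have been converted to numeric beforehand.

```r
library(ggplot2)

# Hypothetical 2015 rows; population is assumed to be numeric already.
toy_2015 <- data.frame(
  Country    = c("A", "B", "C"),
  income     = c(1500, 12000, 45000),
  life       = c(58.2, 71.4, 81.0),
  population = c(2.5e6, 4.0e7, 9.0e6),
  region     = c("South Asia", "America", "Europe & Central Asia")
)

# Same mapping as the original plot, with a continuous size scale.
p_numeric <- ggplot(toy_2015,
                    aes(income, life, size = population, color = region)) +
  geom_point(alpha = 0.7) +
  scale_x_log10() +
  scale_size_area(max_size = 12) +  # point area proportional to population
  theme_bw()
```

Passing `p_numeric` to `ggplotly()` should then give the interactive version, as in the chunk above.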